Class-based Word Sense Induction for dot-type nominals
Abstract
This paper describes an effort to capture the sense alternation of dot-type nominals using Word Sense Induction (WSI). We propose that dot-type nominals generate more semantically consistent groupings when clustered into more than two clusters, accounting for literal, metonymic and underspecified senses. Using a class-based approach, we replace individual lemmas with a placeholder representing the entire dot type, which also compensates for data sparsity. Although the distributional evidence does not motivate an individual cluster for each sense, we discuss how our results empirically support theoretical proposals regarding dot types.
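The class-based step described in the abstract can be pictured with a minimal sketch. It is not the authors' implementation: the dot-type label `LOCATION_ORGANIZATION`, the lemma list, the window size, and both function names are hypothetical choices made for illustration. The sketch replaces each member lemma with a single placeholder and pools the context features of all occurrences, which is the kind of input a clustering algorithm would then group into sense clusters.

```python
from collections import Counter

# Hypothetical lemma list for one dot type; illustrative only, not from the paper.
DOT_TYPE_LEMMAS = {
    "LOCATION_ORGANIZATION": {"england", "france", "germany"},
}

def replace_with_placeholder(tokens, dot_type, lemmas_by_type=DOT_TYPE_LEMMAS):
    """Replace every token that belongs to the dot type with the type's placeholder."""
    members = lemmas_by_type[dot_type]
    return [dot_type if tok.lower() in members else tok for tok in tokens]

def context_features(tokens, placeholder, window=2):
    """Return one bag-of-words Counter per placeholder occurrence (symmetric window)."""
    features = []
    for i, tok in enumerate(tokens):
        if tok != placeholder:
            continue
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        features.append(Counter(w.lower() for w in left + right))
    return features

# Toy sentences (assumed data): occurrences of different lemmas are pooled
# under one placeholder, so sparse lemmas share distributional evidence.
sentences = [
    "England won the match".split(),
    "She moved to France last year".split(),
]
pooled = []
for sent in sentences:
    replaced = replace_with_placeholder(sent, "LOCATION_ORGANIZATION")
    pooled.extend(context_features(replaced, "LOCATION_ORGANIZATION"))
```

Each entry in `pooled` is the context of one occurrence of the dot type as a whole. A clustering step with more than two clusters, as the abstract proposes, would operate on these pooled contexts; that step is omitted here.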
Similar resources
Detecting Compositionality in Multi-Word Expressions
Identifying whether a multi-word expression (MWE) is compositional or not is important for numerous NLP applications. Sense induction can partition the context of MWEs into semantic uses and therefore aid in deciding compositionality. We propose an unsupervised system to explore this hypothesis on compound nominals, proper names and adjective-noun constructions, and evaluate the contribution of...
Detecting selectional behavior of complex types in text
In this paper, we discuss some aspects of selectional behavior of dot objects, and present an algorithm for clustering selector contexts for dot nominals according to the selected type. The clustering algorithm is based on the notion of contextualized similarity between selector contexts and defines a similarity measure for contextual equivalents of the target nominal.
MaxMax: A Graph-Based Soft Clustering Algorithm Applied to Word Sense Induction
This paper introduces a linear time graph-based soft clustering algorithm. The algorithm applies a simple idea: given a graph, vertex pairs are assigned to the same cluster if either vertex has maximal affinity to the other. Clusters of varying size, shape, and density are found automatically, making the algorithm suited to tasks such as Word Sense Induction (WSI), where the number of classes is un...
Word Sense Induction and Disambiguation Rivaling Supervised Methods
Word Sense Disambiguation (WSD) aims to determine the meaning of a word in context, and successful approaches are known to benefit many applications in Natural Language Processing. Although supervised learning has been shown to provide superior WSD performance, current sense-annotated corpora do not contain a sufficient number of instances per word type to train supervised systems for all words...
Structured Generative Models of Continuous Features for Word Sense Induction
We propose a structured generative latent variable model that integrates information from multiple contextual representations for Word Sense Induction. Our approach jointly models global lexical, local lexical and dependency syntactic context. Each context type is associated with a latent variable and the three types of variables share a hierarchical structure. We use skip-gram based word and d...